Adaptive Kernel Selection for Stein Variational Gradient Descent
Melcher, Moritz, Weissmann, Simon, Wilson, Ashia C., Zech, Jakob
A central challenge in Bayesian inference is efficiently approximating posterior distributions. Stein Variational Gradient Descent (SVGD) is a popular variational inference method which transports a set of particles to approximate a target distribution. The SVGD dynamics are governed by a reproducing kernel Hilbert space (RKHS) and are highly sensitive to the choice of the kernel function, which directly influences both convergence and approximation quality. The commonly used median heuristic offers a simple approach for setting kernel bandwidths but lacks flexibility and often performs poorly, particularly in high-dimensional settings. In this work, we propose an alternative strategy for adaptively choosing kernel parameters over an abstract family of kernels. Recent convergence analyses based on the kernelized Stein discrepancy (KSD) suggest that optimizing the kernel parameters by maximizing the KSD can improve performance. Building on this insight, we introduce Adaptive SVGD (Ad-SVGD), a method that alternates between updating the particles via SVGD and adaptively tuning kernel bandwidths through gradient ascent on the KSD. We provide a simplified theoretical analysis that extends existing results on minimizing the KSD for fixed kernels to our adaptive setting, showing convergence properties for the maximal KSD over our kernel class. Our empirical results further support this intuition: Ad-SVGD consistently outperforms standard heuristics in a variety of tasks.
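A minimal sketch of the alternating scheme described in this abstract, assuming a standard Gaussian target and an RBF kernel family parameterized by its log-bandwidth; all names, step sizes, and iteration counts below are illustrative and this is not the authors' implementation:

```python
import jax
import jax.numpy as jnp

def log_target(x):
    return -0.5 * jnp.sum(x ** 2)   # standard Gaussian target (assumed for illustration)

score = jax.grad(log_target)        # s(x) = grad log p(x), the only target information needed

def rbf(x, y, log_h):
    return jnp.exp(-jnp.sum((x - y) ** 2) / (2.0 * jnp.exp(2.0 * log_h)))

def stein_kernel(x, y, log_h):
    # Langevin Stein kernel k_p(x, y) built from the base kernel and the score
    k = lambda a, b: rbf(a, b, log_h)
    gx = jax.grad(k, argnums=0)(x, y)
    gy = jax.grad(k, argnums=1)(x, y)
    mixed = jnp.trace(jax.jacfwd(jax.grad(k, argnums=0), argnums=1)(x, y))
    return mixed + gx @ score(y) + gy @ score(x) + k(x, y) * (score(x) @ score(y))

def ksd_sq(particles, log_h):
    # V-statistic estimate of KSD^2 for the empirical measure of the particles
    gram = jax.vmap(lambda a: jax.vmap(lambda b: stein_kernel(a, b, log_h))(particles))(particles)
    return jnp.mean(gram)

def svgd_step(particles, log_h, eps=0.05):
    def phi(xi):
        k_vals = jax.vmap(lambda xj: rbf(xj, xi, log_h))(particles)
        grad_k = jax.vmap(lambda xj: jax.grad(rbf, argnums=0)(xj, xi, log_h))(particles)
        return jnp.mean(k_vals[:, None] * jax.vmap(score)(particles) + grad_k, axis=0)
    return particles + eps * jax.vmap(phi)(particles)

particles = jax.random.normal(jax.random.PRNGKey(0), (50, 2)) + 3.0  # poorly initialized particles
log_h = jnp.array(0.0)
for _ in range(200):
    particles = svgd_step(particles, log_h)                                # transport step
    log_h = log_h + 0.1 * jax.grad(ksd_sq, argnums=1)(particles, log_h)    # gradient ascent on KSD^2
```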
Improved Finite-Particle Convergence Rates for Stein Variational Gradient Descent
Balasubramanian, Krishnakumar, Banerjee, Sayan, Ghosal, Promit
We provide finite-particle convergence rates for the Stein Variational Gradient Descent (SVGD) algorithm in the Kernel Stein Discrepancy ($\mathsf{KSD}$) and Wasserstein-2 metrics. Our key insight is the observation that the time derivative of the relative entropy between the joint density of $N$ particle locations and the $N$-fold product target measure, starting from a regular initial distribution, splits into a dominant 'negative part' proportional to $N$ times the expected $\mathsf{KSD}^2$ and a smaller 'positive part'. This observation leads to $\mathsf{KSD}$ rates of order $1/\sqrt{N}$, providing a near-optimal double-exponential improvement over the recent result of Shi and Mackey (2024). Under mild assumptions on the kernel and potential, these bounds also grow linearly in the dimension $d$. By adding a bilinear component to the kernel, the above approach is used to further obtain Wasserstein-2 convergence. For the case of 'bilinear + Matérn' kernels, we derive Wasserstein-2 rates that exhibit a curse-of-dimensionality similar to the i.i.d. setting. We also obtain marginal convergence and long-time propagation of chaos results for the time-averaged particle laws.
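For orientation, the $\mathsf{KSD}$ metric in which these finite-particle rates are stated is the standard discrete kernel Stein discrepancy of the particle empirical measure; the definition below is textbook background rather than anything specific to this paper:

```latex
% Discrete KSD of the empirical measure of particles x_1, ..., x_N with respect to a
% target pi, using the Langevin Stein kernel k_pi built from a base kernel k and the
% score s(x) = grad log pi(x).
\[
  \mathsf{KSD}\Big(\tfrac{1}{N}\textstyle\sum_{i=1}^{N}\delta_{x_i}\Big)
    = \Big(\tfrac{1}{N^{2}}\textstyle\sum_{i=1}^{N}\sum_{j=1}^{N} k_\pi(x_i, x_j)\Big)^{1/2},
\]
\[
  k_\pi(x, y) = \nabla_x \cdot \nabla_y k(x, y)
    + \nabla_x k(x, y) \cdot s(y)
    + \nabla_y k(x, y) \cdot s(x)
    + k(x, y)\, s(x) \cdot s(y).
\]
```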
Kernel Stein Discrepancy thinning: a theoretical perspective of pathologies and a practical fix with regularization
Bénard, Clément, Staber, Brian, Da Veiga, Sébastien
Stein thinning is a promising algorithm proposed by Riabiz et al. (2022) for post-processing outputs of Markov chain Monte Carlo (MCMC). The main principle is to greedily minimize the kernelized Stein discrepancy (KSD), which only requires the gradient of the log-target distribution, and is thus well-suited for Bayesian inference. The main advantages of Stein thinning are the automatic removal of the burn-in period, the correction of the bias introduced by recent MCMC algorithms, and the asymptotic properties of convergence towards the target distribution. Nevertheless, Stein thinning suffers from several empirical pathologies, which may result in poor approximations, as observed in the literature. In this article, we conduct a theoretical analysis of these pathologies to clearly identify the mechanisms at stake, and we suggest improved strategies. We then introduce the regularized Stein thinning algorithm to alleviate the identified pathologies. Finally, theoretical guarantees and extensive experiments demonstrate the high efficiency of the proposed algorithm. An implementation of regularized Stein thinning is available as the kernax library, written in Python and JAX, at https://gitlab.com/drti/kernax.
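A minimal sketch of the vanilla greedy Stein thinning step that this abstract refers to (Riabiz et al., 2022), assuming an RBF base kernel and a standard Gaussian stand-in target; this is an illustration only, not the kernax API and not the regularized variant proposed in the paper:

```python
import jax
import jax.numpy as jnp

def log_target(x):
    return -0.5 * jnp.sum(x ** 2)       # stand-in target; only its gradient is needed

score = jax.grad(log_target)

def rbf(x, y, h=1.0):
    return jnp.exp(-jnp.sum((x - y) ** 2) / (2.0 * h ** 2))

def stein_kernel(x, y):
    # Langevin Stein kernel built from the base kernel and the score function
    gx = jax.grad(rbf, argnums=0)(x, y)
    gy = jax.grad(rbf, argnums=1)(x, y)
    mixed = jnp.trace(jax.jacfwd(jax.grad(rbf, argnums=0), argnums=1)(x, y))
    return mixed + gx @ score(y) + gy @ score(x) + rbf(x, y) * (score(x) @ score(y))

def stein_thinning(samples, m):
    """Greedily pick m points so as to minimize the KSD of the selected set."""
    kp = jax.vmap(lambda a: jax.vmap(lambda b: stein_kernel(a, b))(samples))(samples)
    diag = jnp.diag(kp)
    running = jnp.zeros(samples.shape[0])        # sum_j k_p(x_j, x) over already-selected j
    selected = []
    for _ in range(m):
        obj = diag + 2.0 * running               # increase of the KSD double sum if x is added
        i = int(jnp.argmin(obj))
        selected.append(i)
        running = running + kp[i]
    return samples[jnp.array(selected)]

mcmc_output = jax.random.normal(jax.random.PRNGKey(1), (500, 2))   # mock MCMC samples
thinned = stein_thinning(mcmc_output, m=50)
```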
Provably Fast Finite Particle Variants of SVGD via Virtual Particle Stochastic Approximation
Stein Variational Gradient Descent (SVGD) is a popular variational inference algorithm which simulates an interacting particle system to approximately sample from a target distribution, with impressive empirical performance across various domains. Theoretically, its population (i.e., infinite-particle) limit dynamics are well studied, but the behavior of SVGD in the finite-particle regime is much less understood. In this work, we design two computationally efficient variants of SVGD, namely VP-SVGD and GB-SVGD, with provably fast finite-particle convergence rates. We introduce the notion of virtual particles and develop novel stochastic approximations of population-limit SVGD dynamics in the space of probability measures, which are exactly implementable using a finite number of particles. Our algorithms can be viewed as specific random-batch approximations of SVGD, which are computationally more efficient than ordinary SVGD. We show that the $n$ particles output by VP-SVGD and GB-SVGD, run for $T$ steps with batch-size $K$, are at least as good as i.i.d. samples from a distribution whose Kernel Stein Discrepancy to the target is at most $O\left(\tfrac{d^{1/3}}{(KT)^{1/6}}\right)$ under standard assumptions. Our results also hold under a mild growth condition on the potential function, which is much weaker than the isoperimetric (e.g. Poincaré Inequality) or information-transport conditions (e.g. Talagrand's Inequality $\mathsf{T}_1$) generally considered in prior works. As a corollary, we consider the convergence of the empirical measure (of the particles output by VP-SVGD and GB-SVGD) to the target distribution and demonstrate a double-exponential improvement over the best known finite-particle analysis of SVGD. Beyond this, our results present the first known oracle complexities for this setting with polynomial dimension dependence.
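The abstract describes VP-SVGD and GB-SVGD as random-batch approximations of SVGD but does not spell out their constructions; the sketch below only illustrates the generic random-batch idea (each update is driven by a random batch of K particles rather than all n), with the target, names, and step sizes chosen purely for the example:

```python
import jax
import jax.numpy as jnp

def log_target(x):
    return -0.5 * jnp.sum(x ** 2)       # illustrative target

score = jax.grad(log_target)

def rbf(x, y, h=1.0):
    return jnp.exp(-jnp.sum((x - y) ** 2) / (2.0 * h ** 2))

def random_batch_svgd_step(key, particles, batch_size, eps=0.05):
    # Draw a random batch of particles and use only that batch to build the update direction.
    idx = jax.random.choice(key, particles.shape[0], (batch_size,), replace=False)
    batch = particles[idx]
    def phi(xi):
        k_vals = jax.vmap(lambda xj: rbf(xj, xi))(batch)
        grad_k = jax.vmap(lambda xj: jax.grad(rbf, argnums=0)(xj, xi))(batch)
        return jnp.mean(k_vals[:, None] * jax.vmap(score)(batch) + grad_k, axis=0)
    return particles + eps * jax.vmap(phi)(particles)

key = jax.random.PRNGKey(2)
particles = jax.random.normal(key, (200, 2)) + 2.0
for _ in range(500):
    key, sub = jax.random.split(key)
    particles = random_batch_svgd_step(sub, particles, batch_size=16)
```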